Distance functions for categorical and mixed variables


Related resources

Gower distance-based multivariate control charts for a mixture of continuous and categorical variables

Processes characterized by high dimensional and mixture data challenge traditional statistical process control charts. In this study, we propose a multivariate control chart based on the Gower distance that can handle a mixture of continuous and categorical data. An extensive simulation study was conducted to examine the properties of the proposed control chart under various scenarios and compa...
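For orientation, the Gower distance mentioned in the abstract above can be sketched as follows. This is a minimal illustration under the usual definition (range-normalized absolute difference for numeric features, simple mismatch for categorical ones, averaged with equal weights), not the control chart construction from that paper. The function name `gower_distance` and the `ranges` parameter are assumptions made for this sketch.

```python
from typing import Optional, Sequence, Union

Value = Union[float, str]


def gower_distance(
    a: Sequence[Value],
    b: Sequence[Value],
    ranges: Sequence[Optional[float]],
) -> float:
    """Equal-weight Gower distance between two mixed-type records.

    ranges[k] is the observed range (max - min) of numeric feature k,
    or None when feature k is categorical. (Hypothetical interface.)
    """
    if not (len(a) == len(b) == len(ranges)) or len(ranges) == 0:
        raise ValueError("records and ranges must be non-empty and equal length")

    total = 0.0
    for x, y, r in zip(a, b, ranges):
        if r is None:
            # Categorical feature: 0 on a match, 1 on a mismatch.
            total += 0.0 if x == y else 1.0
        elif r > 0:
            # Numeric feature: absolute difference scaled by the range.
            total += abs(x - y) / r
        # A numeric feature with zero range contributes 0.
    return total / len(ranges)


# One numeric feature (range 4.0) and one categorical feature.
d = gower_distance([1.0, "red"], [3.0, "blue"], [4.0, None])
print(d)  # 0.75
```

With equal weights, each feature contributes a value in [0, 1], so the result is also in [0, 1].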


Covariance and PCA for Categorical Variables

Covariances from categorical variables are defined using a regular simplex expression for categories. The method follows the variance definition by Gini, and it gives the covariance as a solution of simultaneous equations. The calculated results give reasonable values for test data. A method of principal component analysis (RS-PCA) is also proposed using regular simplex expressions, which allow...


Distance-based and probabilistic record linkage for re-identification of records with categorical variables

Record linkage methods are methods for identifying the presence of the same individual in different data files (re-identification). This paper studies and compares the two main existing approaches for record linkage: probabilistic and distance-based. The performance of both approaches is compared when data are categorical. To that end, a distance over ordinal and nominal scales is defined. The ...


A semi-supervised regression model for mixed numerical and categorical variables

In this paper, we develop a semi-supervised regression algorithm to analyze data sets which contain both categorical and numerical attributes. This algorithm partitions the data sets into several clusters and at the same time fits a multivariate regression model to each cluster. This framework allows one to incorporate both multivariate regression models for numerical variables (supervised lear...


Categorical Variables in DEA

If a DEA model has a mix of categorical and continuous variables, a standard LP formulation can still be used by entering all combinations of categorical and continuous variables as different types of inputs and/or outputs. Most units will then not have positive levels of all variables. The implications for selection of peers are investigated. Peers can have the same or fewer types of inputs tha...



Journal

Journal title: Pattern Recognition Letters

Year: 2008

ISSN: 0167-8655

DOI: 10.1016/j.patrec.2008.01.021